Dataset statistics
| Number of variables | 31 |
|---|---|
| Number of observations | 9348 |
| Missing cells | 357 |
| Missing cells (%) | 0.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 2.2 MiB |
| Average record size in memory | 248.0 B |
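The headline figures above can be cross-checked against the per-variable sections of this report. The following is a minimal sketch, not part of the original report. The missing counts are copied from the variable sections, and the names attached to the three counts of 20 are inferred by matching zero counts against the warnings. The 8-bytes-per-value figure is an assumption that happens to reproduce the reported sizes.

```python
# Cross-check of the dataset statistics using numbers copied from this report.
n_vars, n_obs = 31, 9348

# Per-variable missing counts as listed in the variable sections
# (names for the three counts of 20 are inferred from the warnings).
missing_by_var = {
    "numRec": 282,
    "subExcCt": 20,
    "subQuesCt": 20,
    "subBlanks": 20,
    "subSpamWords": 7,
    "isYelling": 7,
    "noHost": 1,
}

missing_cells = sum(missing_by_var.values())
missing_pct = 100 * missing_cells / (n_vars * n_obs)

# Assumption: every value occupies 8 bytes (e.g. int64/float64 columns).
record_bytes = n_vars * 8
total_mib = record_bytes * n_obs / 2**20

print(missing_cells, round(missing_pct, 1), record_bytes, round(total_mib, 1))
```

Under these assumptions the per-variable counts add up to the reported 357 missing cells, and 31 values of 8 bytes each give the reported 248 B per record.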
Variable types
| BOOL | 17 |
|---|---|
| NUM | 14 |
Reproduction
| Analysis started | 2020-06-17 14:12:05.053211 |
|---|---|
| Analysis finished | 2020-06-17 14:12:38.134187 |
| Duration | 33.08 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| bodyCharCt is highly correlated with numLines | High correlation |
| numLines is highly correlated with bodyCharCt | High correlation |
| numRec has 282 (3.0%) missing values | Missing |
| subExcCt is highly skewed (γ1 = 30.02121095) | Skewed |
| numAtt is highly skewed (γ1 = 21.07630037) | Skewed |
| numRec is highly skewed (γ1 = 27.10968379) | Skewed |
| numDlr is highly skewed (γ1 = 59.26093156) | Skewed |
| Unnamed: 0 has unique values | Unique |
| subExcCt has 8498 (90.9%) zeros | Zeros |
| subQuesCt has 8336 (89.2%) zeros | Zeros |
| numAtt has 8782 (93.9%) zeros | Zeros |
| hour has 268 (2.9%) zeros | Zeros |
| perHTML has 8204 (87.8%) zeros | Zeros |
| subBlanks has 233 (2.5%) zeros | Zeros |
| forwards has 5592 (59.8%) zeros | Zeros |
| numDlr has 7578 (81.1%) zeros | Zeros |
Unnamed: 0
Real number (ℝ≥0)
| Distinct count | 9348 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4673.5 |
|---|---|
| Minimum | 0 |
| Maximum | 9347 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 73.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 467.35 |
| Q1 | 2336.75 |
| median | 4673.5 |
| Q3 | 7010.25 |
| 95-th percentile | 8879.65 |
| Maximum | 9347 |
| Range | 9347 |
| Interquartile range (IQR) | 4673.5 |
Descriptive statistics
| Standard deviation | 2698.679492 |
|---|---|
| Coefficient of variation (CV) | 0.5774429211 |
| Kurtosis | -1.2 |
| Mean | 4673.5 |
| Median Absolute Deviation (MAD) | 2337 |
| Skewness | 0 |
| Sum | 43687878 |
| Variance | 7282871 |
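This variable takes each integer from 0 to 9347 exactly once (minimum 0, maximum 9347, 9348 distinct values), so its statistics can be reproduced from a plain integer range. The sketch below is illustrative only. It assumes the sample variance (divisor n - 1) and linear-interpolation quantiles, which are the conventions that reproduce the reported figures; the `quantile` helper is a hypothetical name, not part of any library.

```python
import statistics

values = list(range(9348))  # 0, 1, ..., 9347: one value per row


def quantile(sorted_vals, q):
    """Quantile by linear interpolation between the two nearest ranks."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, divisor n - 1
std = variance ** 0.5
cv = std / mean                         # coefficient of variation

p5 = quantile(values, 0.05)
q1, q3 = quantile(values, 0.25), quantile(values, 0.75)
iqr = q3 - q1

print(mean, variance, round(std, 6), round(cv, 10), p5, q1, q3, iqr)
```

The zero skewness and the kurtosis of about -1.2 reported above are what one expects for evenly spaced values like these.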
| Value | Count | Frequency (%) | |
| 2047 | 1 | < 0.1% | |
| 3291 | 1 | < 0.1% | |
| 3323 | 1 | < 0.1% | |
| 1274 | 1 | < 0.1% | |
| 7417 | 1 | < 0.1% | |
| 5368 | 1 | < 0.1% | |
| 3315 | 1 | < 0.1% | |
| 1266 | 1 | < 0.1% | |
| 7409 | 1 | < 0.1% | |
| 5360 | 1 | < 0.1% | |
| 3307 | 1 | < 0.1% | |
| 1258 | 1 | < 0.1% | |
| 7401 | 1 | < 0.1% | |
| 5352 | 1 | < 0.1% | |
| 3299 | 1 | < 0.1% | |
| 1250 | 1 | < 0.1% | |
| 7393 | 1 | < 0.1% | |
| 5376 | 1 | < 0.1% | |
| 7425 | 1 | < 0.1% | |
| 1282 | 1 | < 0.1% | |
| 3347 | 1 | < 0.1% | |
| 7457 | 1 | < 0.1% | |
| 5408 | 1 | < 0.1% | |
| 3355 | 1 | < 0.1% | |
| 1306 | 1 | < 0.1% | |
| Other values (9323) | 9323 | 99.7% |
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% | |
| 5 | 1 | < 0.1% | |
| 6 | 1 | < 0.1% | |
| 7 | 1 | < 0.1% | |
| 8 | 1 | < 0.1% | |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 9347 | 1 | < 0.1% | |
| 9346 | 1 | < 0.1% | |
| 9345 | 1 | < 0.1% | |
| 9344 | 1 | < 0.1% | |
| 9343 | 1 | < 0.1% | |
| 9342 | 1 | < 0.1% | |
| 9341 | 1 | < 0.1% | |
| 9340 | 1 | < 0.1% | |
| 9339 | 1 | < 0.1% | |
| 9338 | 1 | < 0.1% |
isSpam
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 6951 | 74.4% | |
| 1 | 2397 | 25.6% |
isRe
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 6343 | 67.9% | |
| 1 | 3005 | 32.1% |
underscore
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 9222 | 98.7% | |
| 1 | 126 | 1.3% |
priority
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 9294 | 99.4% | |
| 1 | 54 | 0.6% |
isInReplyTo
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 6556 | 70.1% | |
| 1 | 2792 | 29.9% |
sortedRec
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 1 | 8400 | 89.9% | |
| 0 | 948 | 10.1% |
subPunc
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 9085 | 97.2% | |
| 1 | 263 | 2.8% |
multipartText
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 9020 | 96.5% | |
| 1 | 328 | 3.5% |
hasImages
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 9326 | 99.8% | |
| 1 | 22 | 0.2% |
isPGPsigned
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 9172 | 98.1% | |
| 1 | 176 | 1.9% |
subSpamWords
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 7 |
| Missing (%) | 0.1% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 8697 | 93.0% | |
| 1 | 644 | 6.9% | |
| (Missing) | 7 | 0.1% |
noHost
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 1 |
| Missing (%) | < 0.1% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 9318 | 99.7% | |
| 1 | 29 | 0.3% | |
| (Missing) | 1 | < 0.1% |
numEnd
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 8209 | 87.8% | |
| 1 | 1139 | 12.2% |
isYelling
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 7 |
| Missing (%) | 0.1% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 9134 | 97.7% | |
| 1 | 207 | 2.2% | |
| (Missing) | 7 | 0.1% |
isOrigMsg
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 8988 | 96.1% | |
| 1 | 360 | 3.9% |
isDear
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 9270 | 99.2% | |
| 1 | 78 | 0.8% |
isWrote
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 73.2 KiB |
| Value | Count | Frequency (%) | |
| 0 | 7442 | 79.6% | |
| 1 | 1906 | 20.4% |
numLines
Real number (ℝ≥0)
| Distinct count | 457 |
|---|---|
| Unique (%) | 4.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 66.90853658536585 |
|---|---|
| Minimum | 2 |
| Maximum | 6319 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 73.2 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 7 |
| Q1 | 19 |
| median | 32 |
| Q3 | 59 |
| 95-th percentile | 258 |
| Maximum | 6319 |
| Range | 6317 |
| Interquartile range (IQR) | 40 |
Descriptive statistics
| Standard deviation | 147.9558858 |
|---|---|
| Coefficient of variation (CV) | 2.211315526 |
| Kurtosis | 706.0795696 |
| Mean | 66.90853659 |
| Median Absolute Deviation (MAD) | 17 |
| Skewness | 19.1572773 |
| Sum | 625461 |
| Variance | 21890.94413 |
| Value | Count | Frequency (%) | |
| 7 | 324 | 3.5% | |
| 6 | 234 | 2.5% | |
| 24 | 231 | 2.5% | |
| 8 | 210 | 2.2% | |
| 27 | 199 | 2.1% | |
| 30 | 192 | 2.1% | |
| 23 | 180 | 1.9% | |
| 32 | 180 | 1.9% | |
| 21 | 179 | 1.9% | |
| 28 | 177 | 1.9% | |
| 25 | 171 | 1.8% | |
| 31 | 166 | 1.8% | |
| 29 | 165 | 1.8% | |
| 26 | 165 | 1.8% | |
| 11 | 165 | 1.8% | |
| 22 | 165 | 1.8% | |
| 16 | 163 | 1.7% | |
| 34 | 163 | 1.7% | |
| 17 | 162 | 1.7% | |
| 13 | 161 | 1.7% | |
| 15 | 161 | 1.7% | |
| 19 | 158 | 1.7% | |
| 33 | 153 | 1.6% | |
| 20 | 147 | 1.6% | |
| 18 | 138 | 1.5% | |
| Other values (432) | 4839 | 51.8% |
| Value | Count | Frequency (%) | |
| 2 | 10 | 0.1% | |
| 3 | 4 | < 0.1% | |
| 4 | 13 | 0.1% | |
| 5 | 16 | 0.2% | |
| 6 | 234 | 2.5% | |
| 7 | 324 | 3.5% | |
| 8 | 210 | 2.2% | |
| 9 | 110 | 1.2% | |
| 10 | 109 | 1.2% | |
| 11 | 165 | 1.8% |
| Value | Count | Frequency (%) | |
| 6319 | 2 | < 0.1% | |
| 2523 | 1 | < 0.1% | |
| 1942 | 1 | < 0.1% | |
| 1699 | 2 | < 0.1% | |
| 1694 | 2 | < 0.1% | |
| 1684 | 2 | < 0.1% | |
| 1220 | 1 | < 0.1% | |
| 1197 | 1 | < 0.1% | |
| 1116 | 2 | < 0.1% | |
| 1112 | 2 | < 0.1% |
bodyCharCt
Real number (ℝ≥0)
| Distinct count | 3236 |
|---|---|
| Unique (%) | 34.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2844.0914634146343 |
|---|---|
| Minimum | 6 |
| Maximum | 188505 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 73.2 KiB |
Quantile statistics
| Minimum | 6 |
|---|---|
| 5-th percentile | 192 |
| Q1 | 587 |
| median | 1088.5 |
| Q3 | 2192 |
| 95-th percentile | 11508.3 |
| Maximum | 188505 |
| Range | 188499 |
| Interquartile range (IQR) | 1605 |
Descriptive statistics
| Standard deviation | 6711.335668 |
|---|---|
| Coefficient of variation (CV) | 2.359746778 |
| Kurtosis | 177.0952111 |
| Mean | 2844.091463 |
| Median Absolute Deviation (MAD) | 628.5 |
| Skewness | 9.65440469 |
| Sum | 26586567 |
| Variance | 45042026.45 |
| Value | Count | Frequency (%) | |
| 572 | 23 | 0.2% | |
| 189 | 21 | 0.2% | |
| 151 | 20 | 0.2% | |
| 156 | 17 | 0.2% | |
| 574 | 17 | 0.2% | |
| 201 | 17 | 0.2% | |
| 810 | 16 | 0.2% | |
| 595 | 15 | 0.2% | |
| 200 | 15 | 0.2% | |
| 906 | 15 | 0.2% | |
| 393 | 15 | 0.2% | |
| 99 | 14 | 0.1% | |
| 250 | 14 | 0.1% | |
| 198 | 14 | 0.1% | |
| 815 | 14 | 0.1% | |
| 187 | 14 | 0.1% | |
| 525 | 14 | 0.1% | |
| 1027 | 14 | 0.1% | |
| 1140 | 14 | 0.1% | |
| 542 | 13 | 0.1% | |
| 386 | 13 | 0.1% | |
| 1109 | 13 | 0.1% | |
| 193 | 13 | 0.1% | |
| 1081 | 13 | 0.1% | |
| 487 | 13 | 0.1% | |
| Other values (3211) | 8967 | 95.9% |
| Value | Count | Frequency (%) | |
| 6 | 2 | < 0.1% | |
| 27 | 1 | < 0.1% | |
| 39 | 2 | < 0.1% | |
| 44 | 1 | < 0.1% | |
| 46 | 1 | < 0.1% | |
| 51 | 2 | < 0.1% | |
| 52 | 2 | < 0.1% | |
| 57 | 2 | < 0.1% | |
| 60 | 1 | < 0.1% | |
| 62 | 4 | < 0.1% |
| Value | Count | Frequency (%) | |
| 188505 | 2 | < 0.1% | |
| 124641 | 2 | < 0.1% | |
| 106489 | 1 | < 0.1% | |
| 86875 | 1 | < 0.1% | |
| 86336 | 1 | < 0.1% | |
| 86325 | 1 | < 0.1% | |
| 84017 | 2 | < 0.1% | |
| 71755 | 1 | < 0.1% | |
| 71447 | 2 | < 0.1% | |
| 67998 | 1 | < 0.1% |
subExcCt
Real number (ℝ≥0)
| Distinct count | 8 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 20 |
| Missing (%) | 0.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.13132504288164665 |
|---|---|
| Minimum | 0.0 |
| Maximum | 42.0 |
| Zeros | 8498 |
| Zeros (%) | 90.9% |
| Memory size | 73.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 42 |
| Range | 42 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.6615645999 |
|---|---|
| Coefficient of variation (CV) | 5.037611908 |
| Kurtosis | 1740.670382 |
| Mean | 0.1313250429 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 30.02121095 |
| Sum | 1225 |
| Variance | 0.4376677198 |
| Value | Count | Frequency (%) | |
| 0 | 8498 | 90.9% | |
| 1 | 624 | 6.7% | |
| 2 | 122 | 1.3% | |
| 3 | 54 | 0.6% | |
| 4 | 13 | 0.1% | |
| 5 | 9 | 0.1% | |
| 8 | 7 | 0.1% | |
| 42 | 1 | < 0.1% | |
| (Missing) | 20 | 0.2% |
| Value | Count | Frequency (%) | |
| 0 | 8498 | 90.9% | |
| 1 | 624 | 6.7% | |
| 2 | 122 | 1.3% | |
| 3 | 54 | 0.6% | |
| 4 | 13 | 0.1% | |
| 5 | 9 | 0.1% | |
| 8 | 7 | 0.1% | |
| 42 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 42 | 1 | < 0.1% | |
| 8 | 7 | 0.1% | |
| 5 | 9 | 0.1% | |
| 4 | 13 | 0.1% | |
| 3 | 54 | 0.6% | |
| 2 | 122 | 1.3% | |
| 1 | 624 | 6.7% | |
| 0 | 8498 | 90.9% |
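Because this variable has only eight distinct values, its summary statistics can be recomputed from the frequency table alone. The sketch below is illustrative and not part of the report. It assumes the sample variance (divisor n - 1) and the adjusted Fisher-Pearson skewness estimate, which appear to be the conventions behind the reported numbers. Missing rows are excluded from the moments but included in the denominator of the zero percentage.

```python
import math

# Value -> count, copied from the frequency table above (missing rows excluded).
counts = {0: 8498, 1: 624, 2: 122, 3: 54, 4: 13, 5: 9, 8: 7, 42: 1}
n_missing = 20

n = sum(counts.values())                       # non-missing observations
total = sum(v * c for v, c in counts.items())  # the reported "Sum"
mean = total / n

# Central sums of squares and cubes, weighted by the counts.
ss2 = sum(c * (v - mean) ** 2 for v, c in counts.items())
ss3 = sum(c * (v - mean) ** 3 for v, c in counts.items())

variance = ss2 / (n - 1)                       # sample variance
g1 = (ss3 / n) / (ss2 / n) ** 1.5              # moment-based skewness
# Assumed convention: adjusted Fisher-Pearson estimate.
skew = g1 * math.sqrt(n * (n - 1)) / (n - 2)

zeros_pct = 100 * counts[0] / (n + n_missing)  # share of all 9348 rows

print(n, total, mean, variance, skew, round(zeros_pct, 1))
```

The single observation at 42 drives most of the skewness: it contributes far more to the sum of cubes than all other rows combined.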
subQuesCt
Real number (ℝ≥0)
| Distinct count | 8 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 20 |
| Missing (%) | 0.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.13775728987993138 |
|---|---|
| Minimum | 0.0 |
| Maximum | 12.0 |
| Zeros | 8336 |
| Zeros (%) | 89.2% |
| Memory size | 73.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 12 |
| Range | 12 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.5076853212 |
|---|---|
| Coefficient of variation (CV) | 3.685360838 |
| Kurtosis | 99.23061573 |
| Mean | 0.1377572899 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 7.437372476 |
| Sum | 1285 |
| Variance | 0.2577443854 |
| Value | Count | Frequency (%) | |
| 0 | 8336 | 89.2% | |
| 1 | 870 | 9.3% | |
| 4 | 61 | 0.7% | |
| 2 | 42 | 0.4% | |
| 3 | 14 | 0.1% | |
| 12 | 2 | < 0.1% | |
| 8 | 2 | < 0.1% | |
| 5 | 1 | < 0.1% | |
| (Missing) | 20 | 0.2% |
| Value | Count | Frequency (%) | |
| 0 | 8336 | 89.2% | |
| 1 | 870 | 9.3% | |
| 2 | 42 | 0.4% | |
| 3 | 14 | 0.1% | |
| 4 | 61 | 0.7% | |
| 5 | 1 | < 0.1% | |
| 8 | 2 | < 0.1% | |
| 12 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 12 | 2 | < 0.1% | |
| 8 | 2 | < 0.1% | |
| 5 | 1 | < 0.1% | |
| 4 | 61 | 0.7% | |
| 3 | 14 | 0.1% | |
| 2 | 42 | 0.4% | |
| 1 | 870 | 9.3% | |
| 0 | 8336 | 89.2% |
numAtt
Real number (ℝ≥0)
| Distinct count | 6 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.06578947368421052 |
|---|---|
| Minimum | 0.0 |
| Maximum | 18.0 |
| Zeros | 8782 |
| Zeros (%) | 93.9% |
| Memory size | 73.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 18 |
| Range | 18 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.3248786054 |
|---|---|
| Coefficient of variation (CV) | 4.938154802 |
| Kurtosis | 1016.808563 |
| Mean | 0.06578947368 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 21.07630037 |
| Sum | 615 |
| Variance | 0.1055461082 |
| Value | Count | Frequency (%) | |
| 0 | 8782 | 93.9% | |
| 1 | 544 | 5.8% | |
| 2 | 17 | 0.2% | |
| 5 | 3 | < 0.1% | |
| 18 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 8782 | 93.9% | |
| 1 | 544 | 5.8% | |
| 2 | 17 | 0.2% | |
| 4 | 1 | < 0.1% | |
| 5 | 3 | < 0.1% | |
| 18 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 18 | 1 | < 0.1% | |
| 5 | 3 | < 0.1% | |
| 4 | 1 | < 0.1% | |
| 2 | 17 | 0.2% | |
| 1 | 544 | 5.8% | |
| 0 | 8782 | 93.9% |
numRec
Real number (ℝ≥0)
| Distinct count | 51 |
|---|---|
| Unique (%) | 0.6% |
| Missing | 282 |
| Missing (%) | 3.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.929406574012795 |
|---|---|
| Minimum | 0.0 |
| Maximum | 311.0 |
| Zeros | 92 |
| Zeros (%) | 1.0% |
| Memory size | 73.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 4 |
| Maximum | 311 |
| Range | 311 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 5.242396014 |
|---|---|
| Coefficient of variation (CV) | 2.717102805 |
| Kurtosis | 1371.669611 |
| Mean | 1.929406574 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 27.10968379 |
| Sum | 17492 |
| Variance | 27.48271596 |
| Value | Count | Frequency (%) | |
| 1 | 6802 | 72.8% | |
| 2 | 1269 | 13.6% | |
| 3 | 345 | 3.7% | |
| 4 | 140 | 1.5% | |
| 0 | 92 | 1.0% | |
| 5 | 73 | 0.8% | |
| 10 | 48 | 0.5% | |
| 7 | 35 | 0.4% | |
| 8 | 34 | 0.4% | |
| 11 | 30 | 0.3% | |
| 12 | 24 | 0.3% | |
| 6 | 24 | 0.3% | |
| 9 | 17 | 0.2% | |
| 19 | 14 | 0.1% | |
| 14 | 13 | 0.1% | |
| 44 | 8 | 0.1% | |
| 15 | 7 | 0.1% | |
| 16 | 7 | 0.1% | |
| 48 | 7 | 0.1% | |
| 18 | 6 | 0.1% | |
| 21 | 5 | 0.1% | |
| 45 | 5 | 0.1% | |
| 13 | 5 | 0.1% | |
| 46 | 5 | 0.1% | |
| 32 | 4 | < 0.1% | |
| Other values (26) | 47 | 0.5% | |
| (Missing) | 282 | 3.0% |
| Value | Count | Frequency (%) | |
| 0 | 92 | 1.0% | |
| 1 | 6802 | 72.8% | |
| 2 | 1269 | 13.6% | |
| 3 | 345 | 3.7% | |
| 4 | 140 | 1.5% | |
| 5 | 73 | 0.8% | |
| 6 | 24 | 0.3% | |
| 7 | 35 | 0.4% | |
| 8 | 34 | 0.4% | |
| 9 | 17 | 0.2% |
| Value | Count | Frequency (%) | |
| 311 | 1 | < 0.1% | |
| 75 | 1 | < 0.1% | |
| 74 | 1 | < 0.1% | |
| 68 | 1 | < 0.1% | |
| 66 | 2 | < 0.1% | |
| 54 | 2 | < 0.1% | |
| 49 | 3 | < 0.1% | |
| 48 | 7 | 0.1% | |
| 47 | 3 | < 0.1% | |
| 46 | 5 | 0.1% |
perCaps
Real number (ℝ≥0)
| Distinct count | 5201 |
|---|---|
| Unique (%) | 55.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.850370586877977 |
|---|---|
| Minimum | 0.0 |
| Maximum | 100.0 |
| Zeros | 9 |
| Zeros (%) | 0.1% |
| Memory size | 73.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2.504472272 |
| Q1 | 4.255319149 |
| median | 6.055473246 |
| Q3 | 9.059398644 |
| 95-th percentile | 26.46754047 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 4.804079495 |
Descriptive statistics
| Standard deviation | 9.58341544 |
|---|---|
| Coefficient of variation (CV) | 1.082826459 |
| Kurtosis | 21.6732258 |
| Mean | 8.850370587 |
| Median Absolute Deviation (MAD) | 2.174075947 |
| Skewness | 4.014130081 |
| Sum | 82733.26425 |
| Variance | 91.8418515 |
| Value | Count | Frequency (%) | |
| 6.666666667 | 24 | 0.3% | |
| 5.982905983 | 22 | 0.2% | |
| 11.11111111 | 22 | 0.2% | |
| 4.385964912 | 20 | 0.2% | |
| 7.142857143 | 20 | 0.2% | |
| 5.882352941 | 19 | 0.2% | |
| 7.692307692 | 15 | 0.2% | |
| 10.41666667 | 15 | 0.2% | |
| 12.5 | 14 | 0.1% | |
| 4.225352113 | 14 | 0.1% | |
| 5.555555556 | 13 | 0.1% | |
| 4 | 13 | 0.1% | |
| 6.060606061 | 12 | 0.1% | |
| 14.98289624 | 12 | 0.1% | |
| 3.571428571 | 12 | 0.1% | |
| 4.347826087 | 11 | 0.1% | |
| 6.818181818 | 11 | 0.1% | |
| 3.846153846 | 11 | 0.1% | |
| 5 | 11 | 0.1% | |
| 4.545454545 | 11 | 0.1% | |
| 5.263157895 | 10 | 0.1% | |
| 5.333333333 | 10 | 0.1% | |
| 12.04819277 | 10 | 0.1% | |
| 6.303724928 | 10 | 0.1% | |
| 4.62633452 | 10 | 0.1% | |
| Other values (5176) | 8996 | 96.2% |
| Value | Count | Frequency (%) | |
| 0 | 9 | 0.1% | |
| 0.3891050584 | 1 | < 0.1% | |
| 0.4784688995 | 1 | < 0.1% | |
| 0.5424954792 | 1 | < 0.1% | |
| 0.5545286506 | 1 | < 0.1% | |
| 0.7125890736 | 1 | < 0.1% | |
| 0.7936507937 | 2 | < 0.1% | |
| 0.8038585209 | 2 | < 0.1% | |
| 0.8333333333 | 1 | < 0.1% | |
| 0.8403361345 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 100 | 5 | 0.1% | |
| 99.15396058 | 1 | < 0.1% | |
| 98.02283486 | 1 | < 0.1% | |
| 97.9410128 | 1 | < 0.1% | |
| 97.56715661 | 1 | < 0.1% | |
| 96.54178674 | 2 | < 0.1% | |
| 96.27906977 | 1 | < 0.1% | |
| 95.60193813 | 1 | < 0.1% | |
| 93.87008234 | 1 | < 0.1% | |
| 85.73667712 | 1 | < 0.1% |
hour
Real number (ℝ≥0)
| Distinct count | 24 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12.210847240051347 |
|---|---|
| Minimum | 0.0 |
| Maximum | 23.0 |
| Zeros | 268 |
| Zeros (%) | 2.9% |
| Memory size | 73.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 8 |
| median | 13 |
| Q3 | 18 |
| 95-th percentile | 22 |
| Maximum | 23 |
| Range | 23 |
| Interquartile range (IQR) | 10 |
Descriptive statistics
| Standard deviation | 6.623932056 |
|---|---|
| Coefficient of variation (CV) | 0.5424629369 |
| Kurtosis | -1.092614847 |
| Mean | 12.21084724 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | -0.09694558009 |
| Sum | 114147 |
| Variance | 43.87647588 |
| Value | Count | Frequency (%) | |
| 8 | 1386 | 14.8% | |
| 15 | 589 | 6.3% | |
| 18 | 446 | 4.8% | |
| 19 | 437 | 4.7% | |
| 16 | 423 | 4.5% | |
| 22 | 412 | 4.4% | |
| 20 | 409 | 4.4% | |
| 23 | 407 | 4.4% | |
| 2 | 406 | 4.3% | |
| 21 | 405 | 4.3% | |
| 14 | 399 | 4.3% | |
| 17 | 393 | 4.2% | |
| 13 | 371 | 4.0% | |
| 11 | 320 | 3.4% | |
| 10 | 311 | 3.3% | |
| 9 | 305 | 3.3% | |
| 1 | 280 | 3.0% | |
| 3 | 270 | 2.9% | |
| 0 | 268 | 2.9% | |
| 12 | 243 | 2.6% | |
| 4 | 236 | 2.5% | |
| 5 | 222 | 2.4% | |
| 7 | 218 | 2.3% | |
| 6 | 192 | 2.1% |
| Value | Count | Frequency (%) | |
| 0 | 268 | 2.9% | |
| 1 | 280 | 3.0% | |
| 2 | 406 | 4.3% | |
| 3 | 270 | 2.9% | |
| 4 | 236 | 2.5% | |
| 5 | 222 | 2.4% | |
| 6 | 192 | 2.1% | |
| 7 | 218 | 2.3% | |
| 8 | 1386 | 14.8% | |
| 9 | 305 | 3.3% |
| Value | Count | Frequency (%) | |
| 23 | 407 | 4.4% | |
| 22 | 412 | 4.4% | |
| 21 | 405 | 4.3% | |
| 20 | 409 | 4.4% | |
| 19 | 437 | 4.7% | |
| 18 | 446 | 4.8% | |
| 17 | 393 | 4.2% | |
| 16 | 423 | 4.5% | |
| 15 | 589 | 6.3% | |
| 14 | 399 | 4.3% |
perHTML
Real number (ℝ≥0)
| Distinct count | 885 |
|---|---|
| Unique (%) | 9.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.5170821409935185 |
|---|---|
| Minimum | 0.0 |
| Maximum | 100.0 |
| Zeros | 8204 |
| Zeros (%) | 87.8% |
| Memory size | 73.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 59.57621835 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 19.13526602 |
|---|---|
| Coefficient of variation (CV) | 2.936170759 |
| Kurtosis | 7.927976454 |
| Mean | 6.517082141 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.990762235 |
| Sum | 60921.68385 |
| Variance | 366.1584056 |
| Value | Count | Frequency (%) | |
| 0 | 8204 | 87.8% | |
| 86.39674379 | 8 | 0.1% | |
| 30.2303263 | 6 | 0.1% | |
| 23.67890903 | 5 | 0.1% | |
| 35.09183275 | 5 | 0.1% | |
| 23.75690608 | 5 | 0.1% | |
| 32.67086767 | 4 | < 0.1% | |
| 47.5526075 | 4 | < 0.1% | |
| 35.96526656 | 4 | < 0.1% | |
| 41.74392383 | 4 | < 0.1% | |
| 58.3497053 | 4 | < 0.1% | |
| 63.16162571 | 4 | < 0.1% | |
| 45.36500579 | 4 | < 0.1% | |
| 62.18655968 | 4 | < 0.1% | |
| 48.76794515 | 4 | < 0.1% | |
| 66.72920575 | 4 | < 0.1% | |
| 32.24409449 | 4 | < 0.1% | |
| 17.97268152 | 4 | < 0.1% | |
| 50.17459038 | 4 | < 0.1% | |
| 100 | 4 | < 0.1% | |
| 36.07032058 | 4 | < 0.1% | |
| 13.14662497 | 3 | < 0.1% | |
| 28.93112718 | 3 | < 0.1% | |
| 24.06779661 | 3 | < 0.1% | |
| 64.94949495 | 3 | < 0.1% | |
| Other values (860) | 1043 | 11.2% |
| Value | Count | Frequency (%) | |
| 0 | 8204 | 87.8% | |
| 7.852882704 | 2 | < 0.1% | |
| 8.474576271 | 2 | < 0.1% | |
| 8.920454545 | 1 | < 0.1% | |
| 10.38154392 | 1 | < 0.1% | |
| 10.89987326 | 1 | < 0.1% | |
| 11.05807109 | 1 | < 0.1% | |
| 11.28880527 | 1 | < 0.1% | |
| 11.29411765 | 1 | < 0.1% | |
| 11.84065234 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 100 | 4 | < 0.1% | |
| 98.83040936 | 1 | < 0.1% | |
| 98.24478178 | 1 | < 0.1% | |
| 97.5743349 | 1 | < 0.1% | |
| 97.34375 | 1 | < 0.1% | |
| 96.16037336 | 1 | < 0.1% | |
| 96.05749921 | 1 | < 0.1% | |
| 94.05594406 | 1 | < 0.1% | |
| 93.95080877 | 2 | < 0.1% | |
| 93.81582273 | 1 | < 0.1% |
subBlanks
Real number (ℝ≥0)
| Distinct count | 546 |
|---|---|
| Unique (%) | 5.9% |
| Missing | 20 |
| Missing (%) | 0.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.866939237078649 |
|---|---|
| Minimum | 0.0 |
| Maximum | 86.41975308641977 |
| Zeros | 233 |
| Zeros (%) | 2.5% |
| Memory size | 73.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 5.555555556 |
| Q1 | 10.52631579 |
| median | 13.25301205 |
| Q3 | 15.68627451 |
| 95-th percentile | 21.42857143 |
| Maximum | 86.41975309 |
| Range | 86.41975309 |
| Interquartile range (IQR) | 5.15995872 |
Descriptive statistics
| Standard deviation | 7.431937546 |
|---|---|
| Coefficient of variation (CV) | 0.5359464997 |
| Kurtosis | 18.46340608 |
| Mean | 13.86693924 |
| Median Absolute Deviation (MAD) | 2.538726334 |
| Skewness | 3.335897406 |
| Sum | 129350.8092 |
| Variance | 55.23369569 |
| Value | Count | Frequency (%) | |
| 14.28571429 | 468 | 5.0% | |
| 12.5 | 398 | 4.3% | |
| 10 | 298 | 3.2% | |
| 16.66666667 | 253 | 2.7% | |
| 11.11111111 | 251 | 2.7% | |
| 9.090909091 | 247 | 2.6% | |
| 0 | 233 | 2.5% | |
| 15.38461538 | 208 | 2.2% | |
| 13.33333333 | 190 | 2.0% | |
| 11.76470588 | 161 | 1.7% | |
| 13.63636364 | 151 | 1.6% | |
| 12.12121212 | 150 | 1.6% | |
| 17.64705882 | 118 | 1.3% | |
| 15.78947368 | 110 | 1.2% | |
| 18.18181818 | 107 | 1.1% | |
| 15 | 105 | 1.1% | |
| 13.79310345 | 104 | 1.1% | |
| 14.70588235 | 103 | 1.1% | |
| 12 | 102 | 1.1% | |
| 14.81481481 | 102 | 1.1% | |
| 8.333333333 | 101 | 1.1% | |
| 10.52631579 | 100 | 1.1% | |
| 6.666666667 | 100 | 1.1% | |
| 11.53846154 | 97 | 1.0% | |
| 9.523809524 | 95 | 1.0% | |
| Other values (521) | 4976 | 53.2% |
| Value | Count | Frequency (%) | |
| 0 | 233 | 2.5% | |
| 1.333333333 | 2 | < 0.1% | |
| 1.785714286 | 2 | < 0.1% | |
| 2.43902439 | 4 | < 0.1% | |
| 2.857142857 | 3 | < 0.1% | |
| 2.898550725 | 1 | < 0.1% | |
| 3.03030303 | 2 | < 0.1% | |
| 3.225806452 | 3 | < 0.1% | |
| 3.333333333 | 1 | < 0.1% | |
| 3.370786517 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 86.41975309 | 1 | < 0.1% | |
| 84.61538462 | 1 | < 0.1% | |
| 84.42211055 | 1 | < 0.1% | |
| 77.6119403 | 1 | < 0.1% | |
| 74.28571429 | 1 | < 0.1% | |
| 71.92982456 | 1 | < 0.1% | |
| 70.68965517 | 2 | < 0.1% | |
| 69.13580247 | 2 | < 0.1% | |
| 65.625 | 7 | 0.1% | |
| 65.51724138 | 1 | < 0.1% |
forwards
Real number (ℝ≥0)
| Distinct count | 853 |
|---|---|
| Unique (%) | 9.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.445085866326238 |
|---|---|
| Minimum | 0.0 |
| Maximum | 99.0582695703355 |
| Zeros | 5592 |
| Zeros (%) | 59.8% |
| Memory size | 73.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 15.38461538 |
| 95-th percentile | 52 |
| Maximum | 99.05826957 |
| Range | 99.05826957 |
| Interquartile range (IQR) | 15.38461538 |
Descriptive statistics
| Standard deviation | 18.26357585 |
|---|---|
| Coefficient of variation (CV) | 1.748532858 |
| Kurtosis | 3.596424676 |
| Mean | 10.44508587 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.997014098 |
| Sum | 97640.66268 |
| Variance | 333.5582027 |
| Value | Count | Frequency (%) | |
| 0 | 5592 | 59.8% | |
| 20 | 60 | 0.6% | |
| 33.33333333 | 58 | 0.6% | |
| 25 | 57 | 0.6% | |
| 50 | 52 | 0.6% | |
| 12.5 | 48 | 0.5% | |
| 14.28571429 | 48 | 0.5% | |
| 11.11111111 | 40 | 0.4% | |
| 16.66666667 | 36 | 0.4% | |
| 13.33333333 | 34 | 0.4% | |
| 40 | 32 | 0.3% | |
| 10 | 31 | 0.3% | |
| 9.090909091 | 31 | 0.3% | |
| 22.22222222 | 29 | 0.3% | |
| 30 | 28 | 0.3% | |
| 18.18181818 | 27 | 0.3% | |
| 28.57142857 | 26 | 0.3% | |
| 17.64705882 | 25 | 0.3% | |
| 15 | 25 | 0.3% | |
| 18.75 | 24 | 0.3% | |
| 27.27272727 | 24 | 0.3% | |
| 5 | 24 | 0.3% | |
| 8.333333333 | 24 | 0.3% | |
| 13.04347826 | 24 | 0.3% | |
| 6.666666667 | 23 | 0.2% | |
| Other values (828) | 2926 | 31.3% |
| Value | Count | Frequency (%) | |
| 0 | 5592 | 59.8% | |
| 0.01582528881 | 2 | < 0.1% | |
| 0.146627566 | 1 | < 0.1% | |
| 0.1773049645 | 2 | < 0.1% | |
| 0.1828153565 | 2 | < 0.1% | |
| 0.1956947162 | 2 | < 0.1% | |
| 0.2178649237 | 1 | < 0.1% | |
| 0.2406738869 | 1 | < 0.1% | |
| 0.2624671916 | 2 | < 0.1% | |
| 0.2645502646 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 99.05826957 | 2 | < 0.1% | |
| 98.36400818 | 3 | < 0.1% | |
| 98.05194805 | 2 | < 0.1% | |
| 95.40229885 | 1 | < 0.1% | |
| 94.27312775 | 1 | < 0.1% | |
| 93.89312977 | 4 | < 0.1% | |
| 92.61603376 | 2 | < 0.1% | |
| 91.82692308 | 1 | < 0.1% | |
| 91.75257732 | 2 | < 0.1% | |
| 91.73553719 | 1 | < 0.1% |
avgWordLen
Real number (ℝ≥0)
| Distinct count | 5020 |
|---|---|
| Unique (%) | 53.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.487221791442675 |
|---|---|
| Minimum | 1.3630718540108986 |
| Maximum | 26.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 73.2 KiB |
Quantile statistics
| Minimum | 1.363071854 |
|---|---|
| 5-th percentile | 3.822419128 |
| Q1 | 4.208256552 |
| median | 4.454545455 |
| Q3 | 4.728506787 |
| 95-th percentile | 5.225085079 |
| Maximum | 26 |
| Range | 24.63692815 |
| Interquartile range (IQR) | 0.5202502353 |
Descriptive statistics
| Standard deviation | 0.568582053 |
|---|---|
| Coefficient of variation (CV) | 0.1267113772 |
| Kurtosis | 227.7112775 |
| Mean | 4.487221791 |
| Median Absolute Deviation (MAD) | 0.2590357305 |
| Skewness | 6.643624729 |
| Sum | 41946.54931 |
| Variance | 0.323285551 |
| Value | Count | Frequency (%) | |
| 5 | 67 | 0.7% | |
| 4.5 | 49 | 0.5% | |
| 4 | 36 | 0.4% | |
| 4.6 | 24 | 0.3% | |
| 4.875 | 24 | 0.3% | |
| 4.166666667 | 24 | 0.3% | |
| 4.833333333 | 23 | 0.2% | |
| 5.181818182 | 21 | 0.2% | |
| 4.333333333 | 21 | 0.2% | |
| 4.2 | 21 | 0.2% | |
| 4.8 | 17 | 0.2% | |
| 4.611111111 | 16 | 0.2% | |
| 4.3 | 16 | 0.2% | |
| 4.75 | 16 | 0.2% | |
| 4.818181818 | 15 | 0.2% | |
| 4.1 | 14 | 0.1% | |
| 4.615384615 | 14 | 0.1% | |
| 4.25 | 13 | 0.1% | |
| 4.285714286 | 13 | 0.1% | |
| 4.4 | 13 | 0.1% | |
| 4.666666667 | 13 | 0.1% | |
| 4.291666667 | 12 | 0.1% | |
| 4.235294118 | 12 | 0.1% | |
| 4.929824561 | 12 | 0.1% | |
| 5.333333333 | 12 | 0.1% | |
| Other values (4995) | 8830 | 94.5% |
| Value | Count | Frequency (%) | |
| 1.363071854 | 1 | < 0.1% | |
| 1.375862069 | 1 | < 0.1% | |
| 1.377539287 | 1 | < 0.1% | |
| 1.395973154 | 2 | < 0.1% | |
| 1.406744446 | 2 | < 0.1% | |
| 1.463157895 | 6 | 0.1% | |
| 1.5 | 2 | < 0.1% | |
| 1.5125 | 2 | < 0.1% | |
| 1.719934774 | 2 | < 0.1% | |
| 1.7833764 | 2 | < 0.1% |
| Value | Count | Frequency (%) | |
| 26 | 1 | < 0.1% | |
| 9.433891993 | 1 | < 0.1% | |
| 9.038461538 | 2 | < 0.1% | |
| 9.037267081 | 1 | < 0.1% | |
| 8.747368421 | 2 | < 0.1% | |
| 8.194366197 | 2 | < 0.1% | |
| 8.18134715 | 2 | < 0.1% | |
| 8.167630058 | 2 | < 0.1% | |
| 8.162911612 | 2 | < 0.1% | |
| 8.15407855 | 2 | < 0.1% |
numDlr
Real number (ℝ≥0)
| Distinct count | 56 |
|---|---|
| Unique (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.7815575524176295 |
|---|---|
| Minimum | 0 |
| Maximum | 1977 |
| Zeros | 7578 |
| Zeros (%) | 81.1% |
| Memory size | 73.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 5 |
| Maximum | 1977 |
| Range | 1977 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 30.3804554 |
|---|---|
| Coefficient of variation (CV) | 17.05274991 |
| Kurtosis | 3825.174072 |
| Mean | 1.781557552 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 59.26093156 |
| Sum | 16654 |
| Variance | 922.9720702 |
| Value | Count | Frequency (%) | |
| 0 | 7578 | 81.1% | |
| 1 | 568 | 6.1% | |
| 2 | 409 | 4.4% | |
| 3 | 202 | 2.2% | |
| 4 | 120 | 1.3% | |
| 5 | 76 | 0.8% | |
| 6 | 67 | 0.7% | |
| 7 | 38 | 0.4% | |
| 8 | 26 | 0.3% | |
| 14 | 25 | 0.3% | |
| 10 | 24 | 0.3% | |
| 16 | 20 | 0.2% | |
| 12 | 20 | 0.2% | |
| 13 | 20 | 0.2% | |
| 9 | 18 | 0.2% | |
| 19 | 16 | 0.2% | |
| 17 | 11 | 0.1% | |
| 11 | 10 | 0.1% | |
| 159 | 7 | 0.1% | |
| 162 | 6 | 0.1% | |
| 45 | 5 | 0.1% | |
| 15 | 5 | 0.1% | |
| 27 | 5 | 0.1% | |
| 34 | 4 | < 0.1% | |
| 23 | 4 | < 0.1% | |
| Other values (31) | 64 | 0.7% |
Minimum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 7578 | 81.1% | |
| 1 | 568 | 6.1% | |
| 2 | 409 | 4.4% | |
| 3 | 202 | 2.2% | |
| 4 | 120 | 1.3% | |
| 5 | 76 | 0.8% | |
| 6 | 67 | 0.7% | |
| 7 | 38 | 0.4% | |
| 8 | 26 | 0.3% | |
| 9 | 18 | 0.2% |
Maximum 10 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1977 | 2 | < 0.1% | |
| 218 | 2 | < 0.1% | |
| 181 | 2 | < 0.1% | |
| 180 | 1 | < 0.1% | |
| 176 | 1 | < 0.1% | |
| 162 | 6 | 0.1% | |
| 159 | 7 | 0.1% | |
| 148 | 2 | < 0.1% | |
| 144 | 1 | < 0.1% | |
| 138 | 2 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a perfectly linear relationship the steepness of the line does not change the magnitude of r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
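As a rough illustration of this definition, the sketch below computes r in plain Python. The function name `pearson_r` and the toy numbers are assumptions made for this example and are not taken from the report. With pandas, `DataFrame.corr(method="pearson")` should give the same quantity.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of deviations are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Toy data, made up for illustration.
num_lines = [10, 20, 30, 40, 50]
body_chars = [310, 590, 905, 1180, 1520]

r = pearson_r(num_lines, body_chars)
# Shifting and rescaling one variable (positive scale) leaves r unchanged.
r_scaled = pearson_r([2 * x + 7 for x in num_lines], body_chars)
print(r, r_scaled)
```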
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
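The sketch below illustrates this with plain Python: it ranks each variable (ties share their average rank, which is one common convention) and then applies the Pearson formula to the ranks. The helper names and the toy data are assumptions made for this example.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance over the product of standard deviations (common factors cancel)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship (y = x ** 3), made up for illustration.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r_linear = pearson_r(x, y)
print(rho, r_linear)
```

On this toy data the ranks of the two variables coincide, so ρ reaches its maximum while Pearson's r stays below it.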
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
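A minimal sketch of this pair-counting definition, often called tau-a, is shown below. The function name and the toy data are assumptions made for this example. Library implementations often report a tie-adjusted variant (tau-b) instead, so their values may differ on columns with many ties, such as the zero-heavy count variables in this report.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # direction == 0 means a tie in X or Y; tau-a counts it as neither
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Toy data, made up for illustration: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

This brute-force loop is quadratic in the number of observations, which is fine for a sketch but slow for a full column of 9348 rows.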
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | Unnamed: 0 | isSpam | isRe | underscore | priority | isInReplyTo | sortedRec | subPunc | multipartText | hasImages | isPGPsigned | subSpamWords | noHost | numEnd | isYelling | isOrigMsg | isDear | isWrote | numLines | bodyCharCt | subExcCt | subQuesCt | numAtt | numRec | perCaps | hour | perHTML | subBlanks | forwards | avgWordLen | numDlr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0 | 0 | 50 | 1554 | 0.0 | 0.0 | 0.0 | 2.0 | 4.451039 | 11.0 | 0.0 | 12.500000 | 0.000000 | 4.376623 | 3 |
| 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0 | 0 | 26 | 873 | 0.0 | 0.0 | 0.0 | 1.0 | 7.491289 | 11.0 | 0.0 | 8.000000 | 0.000000 | 4.555556 | 0 |
| 2 | 2 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0 | 0 | 38 | 1713 | 0.0 | 0.0 | 0.0 | 1.0 | 7.436096 | 12.0 | 0.0 | 8.000000 | 0.000000 | 4.817164 | 0 |
| 3 | 3 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0 | 0 | 32 | 1095 | 0.0 | 0.0 | 0.0 | 0.0 | 5.090909 | 13.0 | 0.0 | 18.918919 | 3.125000 | 4.714286 | 0 |
| 4 | 4 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0 | 0 | 31 | 1021 | 0.0 | 0.0 | 0.0 | 1.0 | 6.116643 | 13.0 | 0.0 | 15.217391 | 6.451613 | 4.234940 | 0 |
| 5 | 5 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0 | 0 | 25 | 718 | 0.0 | 0.0 | 0.0 | 1.0 | 7.625272 | 13.0 | 0.0 | 15.217391 | 12.000000 | 3.956897 | 0 |
| 6 | 6 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0 | 0 | 38 | 1288 | 0.0 | 0.0 | 0.0 | 1.0 | 6.343714 | 13.0 | 0.0 | 17.021277 | 0.000000 | 4.051402 | 0 |
| 7 | 7 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0 | 1 | 39 | 1182 | 0.0 | 0.0 | 0.0 | 1.0 | 6.617647 | 14.0 | 0.0 | 15.217391 | 12.820513 | 4.039604 | 0 |
| 8 | 8 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0 | 0 | 126 | 5989 | 0.0 | 0.0 | 0.0 | 1.0 | 3.161361 | 14.0 | 0.0 | 6.250000 | 0.000000 | 4.405222 | 0 |
| 9 | 9 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0 | 0 | 50 | 1554 | 0.0 | 0.0 | 0.0 | 2.0 | 4.451039 | 11.0 | 0.0 | 12.500000 | 0.000000 | 4.376623 | 3 |
Last rows
| | Unnamed: 0 | isSpam | isRe | underscore | priority | isInReplyTo | sortedRec | subPunc | multipartText | hasImages | isPGPsigned | subSpamWords | noHost | numEnd | isYelling | isOrigMsg | isDear | isWrote | numLines | bodyCharCt | subExcCt | subQuesCt | numAtt | numRec | perCaps | hour | perHTML | subBlanks | forwards | avgWordLen | numDlr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9338 | 9338 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 1 | 0.0 | 0 | 0 | 0 | 196 | 10810 | 0.0 | 0.0 | 0.0 | 1.0 | 4.622030 | 18.0 | 83.955918 | 14.285714 | 0.000000 | 4.367925 | 5 |
| 9339 | 9339 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0 | 0 | 72 | 3388 | 1.0 | 0.0 | 0.0 | 1.0 | 48.856624 | 12.0 | 0.000000 | 10.000000 | 0.000000 | 7.046036 | 45 |
| 9340 | 9340 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0 | 0 | 123 | 4258 | 0.0 | 0.0 | 0.0 | 1.0 | 18.228974 | 16.0 | 60.005171 | 12.903226 | 0.000000 | 4.290938 | 0 |
| 9341 | 9341 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 1 | 0.0 | 0 | 0 | 0 | 121 | 5945 | 0.0 | 0.0 | 0.0 | 1.0 | 25.321543 | 19.0 | 37.323375 | 3.719008 | 0.826446 | 4.151279 | 2 |
| 9342 | 9342 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1.0 | 0.0 | 1 | 0.0 | 0 | 0 | 0 | 15 | 461 | 0.0 | 0.0 | 0.0 | 1.0 | 2.868852 | 21.0 | 0.000000 | 14.285714 | 0.000000 | 4.692308 | 0 |
| 9343 | 9343 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0 | 0 | 245 | 14841 | 1.0 | 1.0 | 0.0 | 1.0 | 8.572552 | 21.0 | 79.657111 | 13.793103 | 0.000000 | 4.700555 | 2 |
| 9344 | 9344 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0 | 0 | 58 | 1288 | 0.0 | 0.0 | 1.0 | 1.0 | 9.436009 | 23.0 | 0.000000 | 10.526316 | 0.000000 | 4.904255 | 4 |
| 9345 | 9345 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1.0 | 0.0 | 0 | 0.0 | 0 | 0 | 0 | 53 | 2420 | 0.0 | 0.0 | 0.0 | 1.0 | 2.418448 | 8.0 | 0.000000 | 20.000000 | 0.000000 | 4.703704 | 0 |
| 9346 | 9346 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0 | 0.0 | 0 | 0 | 0 | 279 | 23502 | 0.0 | 0.0 | 0.0 | 1.0 | 7.795400 | 23.0 | 0.000000 | 5.263158 | 0.000000 | 5.252690 | 80 |
| 9347 | 9347 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 1 | 1.0 | 0 | 0 | 0 | 35 | 2516 | 0.0 | 0.0 | 0.0 | 1.0 | 4.783951 | 6.0 | 0.000000 | 14.285714 | 0.000000 | 4.823821 | 1 |